home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Workbench Add-On
/
Workbench Add-On - Volume 1.iso
/
Dev
/
Oberon
/
texts
/
TechNotes.Text
< prev
Wrap
Text File
|
1994-05-12
|
53KB
|
1,360 lines
Departement Informatik
Institut für Computersysteme
Eidgenössische Technische Hochschule
Zürich
----------------------------------------------------------------------------
Oberon Technical Notes
Cuno Pfister (ed.)
The purpose of the Oberon technical notes is to provide the implementor
of an Oberon system with the experience gained during the implementation
efforts undertaken at the Institut für Computersysteme at ETH.
Furthermore they give an overview over work already done or under way.
This report contains the first five technical notes.
Table of Contents
1. Oberon Implementations page 28
2. An Integrated Heap Allocator/Garbage Collector page 30
3. Type Guards and Type Tests page 40
4. A Symmetric Solution to the Load/Store Problem page 42
5. Garbage Collection on Open Arrays page 48
----------------------------------------------------------------------------
1. Oberon Implementations
Cuno Pfister
The original Oberon implementation [1] has been realized by N. Wirth and
J. Gutknecht for the Ceres workstation [2]. Several ports of the system
to other machines have been completed since then. We will give a short
description of each of those projects. Differences to the original
implementations are described.
Ceres (National Semiconductor NS32x32)
The original implementation. Oberon is the basic operating system.
Module Display is written in assembly language. The garbage collector is
a mark-and-sweep garbage collector which runs only between commands,
i.e. it does not need to handle pointers on the stack. An access to a
freed module results in a trap on the Ceres-1 and Ceres-2, but goes
undetected on the Ceres-3. The heap allocator/garbage collector is
written in assembly language and linked together with the inner core
modules.
Sun SPARCstation (Sun SPARC)
This implementation [3] runs on top of SunOS as a Unix process. Oberon
takes over the whole screen. The display operations are based on the SUN
Pixrect routines. Oberon files are mapped to Unix files. A variant of
the buddy system strategy is used for heap allocation. The garbage
collector is a mark-and-sweep garbage collector which usually runs
between commands, but when NEW fails due to a memory shortage, the
garbage collector is started also. This collector also takes pointers on
the stack into account. This is done by treating each memory location on
the stack as a possible pointer value. To find out whether a memory word
is a pointer, it is subjected to some plausibility tests, such as
testing whether the value points to a possible block address in the heap
and whether a possibly valid type tag is found there. If the
plausibility tests succeed, the heap is traversed sequentially to find
the corresponding block. If the block is found, the tested memory
location is treated as a root for the garbage collector. Modules are
never freed but merely removed from the module list, i.e. an access to a
freed module cannot be detected and aborted. The loader and heap
allocator/garbage collector are written in Modula-2 and linked as a Unix
application.
Apple Macintosh II (Motorola MC68020)
This implementation [4, 5] runs on top of the MacOS as a (MultiFinder
friendly) application. Oberon runs in one Macintosh window. The display
operations are largely based on the Apple QuickDraw routines. Oberon
files are mapped to Macintosh files. An integrated allocator/collector
is used (see technical note # 2). Modules are never freed but merely
removed from the module list, i.e. an access to a freed module cannot be
detected and aborted. The type descriptors contain information about
procedure variables in records, such that a checked version of the
System.Free command could be implemented in the future. The loader, heap
allocator/garbage collector and some raster operations are written in
assembly language and linked as a Macintosh application.
DEC DECstation (MIPS R2000)
This implementation runs on top of Ultrix as a process. Oberon runs in
an X-window. The display operations are based on X-windows. Oberon files
are mapped to Ultrix files. An integrated allocator/collector is used
(see technical note # 2). Modules are never freed but merely removed
from the module list, i.e. an access to a freed module cannot be
detected and aborted. The loader is written in C and linked as a Unix
application.
IBM S/6000 (IBM S/6000)
This project has recently been started.
IBM PS/2 (Intel 80386)
This project has recently been started.
Others
Other Oberon compiler back-ends have been written by students, but the
system was not ported. The compiler back-ends available produce code for
the following processors:
- a virtual stack machine (similar to P-Code or M-Code)
- Intel 8086
- Intel 80386
- INMOS T800 Transputer
- C language (Oberon subset to C translator, for bootstrapping a compiler)
References
1. Wirth N, Gutknecht J (1989), The Oberon System, Software-Practice and
Experience, 19 (9), 857-893
2. Eberle H (1987), Development and Analysis of a Workstation Computer,
Ph. D. thesis no. 8431, ETH Zürich
3. Templ J (1990), SPARC-Oberon, User's Guide and Implementation, Report
133, ETH Zürich
4. Franz M (1990), The Implementation of MacOberon, Report 141, ETH
Zürich
5. Franz M (1990), MacOberon Reference Manual, Report 142, ETH Zürich
----------------------------------------------------------------------------
2. An Integrated Heap Allocator/Garbage Collector
Beat Heeb, Cuno Pfister
Abstract
Heap Allocation and Garbage Collection are fundamental services of an
Oberon implementation. It is shown how a simple and efficient
implementation of these services can be attained.
Introduction
Programs for personal computers become increasingly loaded with
features, most of them rarely needed. The reason for that is, apart from
the marketing pressure to advertise more features than the competition,
the desire to provide all the features that anyone might ever need or
want. The result is that programs become ever larger, more complex,
buggier, more expensive and sometimes delivered years after their
announcements. Even then they often fail to provide features useful for
a particular task. A solution to this problem lies in the development of
extensible programs. Extensibility here means that a program can provide
the customer with only the features needed most of the time and with
some means to extend it. If the customer needs some special service, he
(or usually a third party) can implement this service himself, without
having access to the original program's source code.
For example, imagine an extensible page layout program which supports
text boxes, draw boxes and bitmap boxes as standard box types, and
commands to operate upon them. A user may have special needs concerning
the available commands, like e.g. a command which aligns the selected
objects in a document in a special way. It should be possible for him to
write such a command, which operates on the exported document data
structure. This poses a subtle problem, though. The command implementor
may generate references, i.e. pointers, to an exported data structure,
and these references are not known to the basic layout program. This
means in particular that when the layout program disposes of the storage
used by a document, there may still be pointers around which reference
this storage. Such pointers are called dangling pointers . Dangling
pointers are one of the most frequent and most dangerous sources of
program malfunctions. Their use often results in the destruction of data
not belonging to the erroneous module, and the destruction may not be
detected for a long time. Such an error is difficult to track down,
consequently it is difficult to determine who is responsible for an
accident caused by the error. So extensibility leads to a loss of
control over references and thereby to an increased probability for
dangling pointers. We will come back to that shortly.
The user of a program like the one described above should also be able
to write an extension that supports special table boxes, for instance.
Such an extension consists of a module which implements the data type
Table, together with the particular behaviour of an instance of this
type, like displaying, storing and reading itself. A document might then
contain text boxes together with table boxes at the same time. This
example is typical in that a new data type, which is similar to existing
types, is introduced, and that variables of similar data types are
integrated in the same data structure (the document). The
object-oriented programming style is well suited for the implementation
of such a program, since it allows the definition of similar, i.e.
compatible types (e.g. Table is a subtype of Box) and since control can
be delegated to the subtype by means of the overriding facility (e.g.
Table implements its own Draw method). For our discussion the terms
record and object can be used interchangeably.
Our example has shown the integration of different objects types,
potentially implemented by different programmers, in the same data
structure. Close integration is obviously useful. On the other hand, it
magnifies the problems associated with malfunctioning objects by making
the integrity of a data structure dependant on a potentially large
number of object implementations. Especially errors which have non-local
effects are dangerous. This leads us to the conclusion that while
extensibility makes dangling pointers more probable, integration of
extensions makes the effects of dangling pointers graver.
It is possible to prevent this type of errors going undetected by using
a type-safe language. To prevent dangling pointers, the implementation
of a type-safe language must guarantee that every pointer variable is
initialized correctly and that a heap record is only released when there
are no pointers referencing it anymore. The latter task is performed by
a garbage collector. When a garbage collector can prove that a heap
record is not referenced anymore, it reclaims the corresponding heap
area for the storage allocator. Automatic garbage collection is known
for a long time, but has not found its way into production languages
like Pascal or C, not even into their object-oriented descendants. A
garbage collector may reduce the response time and the performance of a
program dramatically, or may require memory sizes several times as large
as would be the case without a garbage collector. Additional reasons why
garbage collectors are not popular are the problems caused by the lack
of type-safety in the mentioned languages, e.g. posed by untagged
variant records, and the complexity involved in handling arbitrary
record types.
We want to show how a simple and efficient garbage collector for Oberon
[1] can be written. Oberon is a type-safe language derived from Modula-2
which allows for an object-oriented programming style. Oberon is also
the name of an operating system and window system [2]. The kernel of an
Oberon system provides, among other things, a storage allocator and a
garbage collector. Oberon has first been implemented on the Ceres
workstation [3]. Our approach is not a radical departure from the Ceres
implementation, but rather a refinement which improves upon memory
utilization and implementation complexity.
Heap Allocation
A simple storage allocation algorithm is presented.
Small Blocks
Storage is allocated in blocks. Block sizes are multiples of a minimal
size B . The size s of an allocated block is at least the size of
variables of the record type bound to the pointer type of p . This size
is known during compilation. For the allocation it is rounded up to the
next multiple of B . There is a free list for all supported block sizes,
i.e. free blocks of the same size are linked by a simple linear list.
The free lists are anchored in a global array A (which could be declared
as ARRAY 1..N OF ADDRESS ) with element number i corresponding to the
free list for size i * B. Allocation of a block of size s consists in
removing a block from free list A[k] where
(k = s / B) (i: s / B i < k: A[i] = NIL) (A[k] NIL).
If k = s / B then the address of the block is returned. If k > s / B
then the block is split into two blocks, the first of size s and the
second of size r = k * B - s . The second block is inserted in the free
list A[r / B] and the address of the first block is returned.
The Ceres implementation is different in that it uses blocks with sizes
restricted to powers of two. This leads to an increased internal
fragmentation of the memory. In our scheme, each heap variable wastes
less than half of the minimal block size. This in contrast to a waste of
less than half of the particular block size. The most notable other
difference is that our scheme is simpler to implement.
Large Blocks
It is reasonable to restrict the size of array A such that it supports
only small blocks (e.g. less than about 100 bytes). Almost all
allocations are done with small block sizes, thus a less efficient
allocation strategy can be used for large blocks. A very simple solution
is to use A[N] as a free list for blocks of variable size (i.e. size³ (N
+ j) * B ) and performing a first fit allocation [4] whenever this list
must be used. Note that the support for this special case fits naturally
into the allocation procedure, it merely adds one line of code. The
following pseudo-code listing shows the complete allocation routine:
procedure Allocate(var a:address; size:longint);
var i:integer; r, l:address;
begin
i := min(size / B, N);
(* calculate index and restrict it to a maximal value *)
while (i < N) & (A[i] = NIL) do
INC(i)
end;
(* search smallest non-empty free list *)
l := adr (A[i]); a := l^;
(* address and value of pointer to first free block *)
while (a # NIL) & (a^.size < size) do
l := ADR(a^.next); a := l^
end;
(* first fit if i = N *)
if a # nil then
l^ := a^.next;
(* remove block from free list *)
if a^.size > size then
(* block must be split *)
i := min ((a^.size - size) / B, N); r := a + size;
r^.size := a^.size - size;
(* adjust size of residual block *)
r^.next := A[i]; A[i] := r
(* insert residual block in free list *)
end
end
end Allocate;
This algorithm doesn't support fast arbitrary block deallocation,
because a freed block cannot efficiently be merged with its
lower-address neighbour and because only simple linear lists are used
for the free lists, which prohibits fast removal of a block from its
free list. In the next chapter we will show how a garbage collector
circumvents the need for fast arbitrary block deallocation.
Garbage Collection
The Ceres implementation of Oberon uses a mark-and-sweep garbage
collector [5] for heap storage reclamation. In Oberon, most temporary
variables are local and therefore allocated and deallocated on the
stack. Thus relatively little garbage is produced compared to typical
Lisp or Smalltalk systems. This explains why the Ceres garbage collector
proved adequate in practice, contradicting the statement that
"Mark-and-sweep automatic storage reclamation does not seem to be
practical on contemporary (1988) computers" [6].
A mark-and-sweep collector works in two phases. In the mark phase, all
objects which still can be referenced are marked. In the sweep phase,
all heap blocks are traversed sequentially. The ones which have not been
marked are reclaimed.
Mark Phase
Let us first consider a simple recursive procedure, which marks all
objects reachable from a given pointer:
procedure Mark(q:address);
var off:address;
begin
if (q # NIL) & (Unmarked(q)) then
SetMark(q); off := FirstPointerOffset(q);
while off >= 0 do
Mark(mem[q + off]); off := NextPointerOffset(q, off)
end
end
end Mark;
This pseudo-code procedure uses four auxiliary procedures. The procedure
FirstPointerOffset determines the record field which contains the first
pointer. The procedure NextPointerOffset repeatedly yields the next
record field containing a pointer. A negative offset is used as a
terminating sentinel.
To implement FirstPointerOffset and NextPointerOffset it must be
possible to efficiently find out the offsets of a record's pointer
fields. These offsets are the same for all variables of this record's
type. Thus it is reasonable to provide a so-called type descriptor
containing a table with all these offset values. Every heap record now
needs a pointer to this type descriptor. This pointer is a hidden record
field called a type tag and is usually located at offset -PtrSize in the
record. The allocation procedure is extended such that it also
initializes this tag. (For every record type, the compiler reserves a
global variable anchoring the type descriptor. The contents of the
appropriate variable is passed to the allocation routine as an
additional parameter.)
Figure 1: Example of a record variable and its type descriptor
Figure 1 shows the descriptor of a type T . It contains the fields size
and ptable . ptable is a table of pointer offsets describing where in a
variable of type T a pointer can be found. This table, which varies from
type to type, is terminated by a negative valued sentinel. The procedure
Unmarked tests whether an object has already been marked, the procedure
SetMark marks an object under the assumption that it is unmarked (i.e.
Unmarked(q) is a precondition of SetMark(q) ). We won't go into details
about how marking is realized. It should be sufficient to say that one
bit of the type tag can be used for marking, usually either the sign bit
or the least significant bit.
The following version of the above procedure replaces FirstPointerOffset
and NextPointerOffset by an increment of the type tag by the size of a
pointer:
procedure Mark(q:address);
begin
if (q # NIL) & (Unmarked(q)) then
SetMark(q); Increment(Tag(q));
while mem[Tag(q)] >= 0 do
Mark(mem[q + mem[Tag(q)]]); Increment(Tag(q))
end;
RestoreTag(q)
end
end Mark;
This means that the type tag of a record changes during the mark phase
such that it always points to the offset to be processed and after the
offsets already processed (see Figure 2).
Figure 2. Type Tag during a Mark Phase
The loop over the offsets terminates when a negative offset is found.
The value of this offset can be initialized such that RestoreTag
simply becomes
Tag(q) := Tag(q) + mem[Tag(q)].
We now transform this procedure such that the guard (q # NIL) &
(Unmarked(q)) is moved outside of the mark procedure.
procedure Mark(q:address);
var r:address;
begin
(* (q # NIL) & JustMarked(q) *)
loop
Increment(Tag(q));
if mem[Tag(q)] >= 0 then
r := mem[q + mem[Tag(q)]];
if (r # NIL) & (Unmarked(r) then
SetMark(r); Mark(r)
end
else
RestoreTag(q);
return
end
end
end Mark;
JustMarked means that the object is marked, but none of its descendants.
To eliminate recursion, we introduce an explicit stack. The recursive
call is replaced by Push(q); q := r and the RETURN is replaced by
Pop(q):
procedure Mark(q:address);
var r:address;
begin
Stack := Empty;
loop
Increment(Tag(q));
if mem[Tag(q)] >= 0 then
r := mem[q + mem[Tag(q)]];
if (r # NIL) & Unmarked(r) then
SetMark(r); Push(q); q := r
end
else
RestoreTag(q);
if Stack = Empty then
exit
else
Pop(q)
end
end
end
end Mark;
The drawback of this procedure is the use of an additional stack, i.e.
of additional memory. In the algorithm of Deutsch/Schorr/Waite [7], the
stack is distributed to the individual pointer locations which are being
traversed. We use a variable p as stack pointer, i.e. as pointer to the
object containing the predecessor. The predecessor is contained in one
of the pointer fields, namely the one currently being processed. The old
value of this pointer field is either held in the auxiliary variable r
or on the stack also. This leads to the following replacements:
Stack = Empty ->
p = NIL
Push(q) ->
mem[q + mem[Tag(q)]] := p; p := q
Pop(q) ->
a := p + mem[Tag[p]]; r := mem[a]; mem[a] := q; q := p; p := r
procedure Mark(q:address);
var p, r, a:address;
begin
p := NIL;
loop
Increment(Tag(q));
if mem[Tag(q)] >= 0 then
r := mem[q + mem[Tag(q)]];
if (r # NIL) & Unmarked(r) then
SetMark(r); mem[q + mem[Tag(q)]] := p; p := q; q := r
end
else
RestoreTag(q);
if p = NIL then exit
else
a := p + mem[Tag[p]]; r := mem[a]; mem[a] := q; q := p; p := r
end
end
end
end Mark;
In the appendix there is a listing of an actual implementation of the
mark procedure written in MC68000 assembly language. This implementation
also takes into consideration that additional data (which is not
important for our discussion) must be stored in the type descriptor,
between the size field and ptable .
Concerning the type descriptors we should add that they may be treated
just as any other records. Thus they need their own type descriptors.
These meta type descriptors differ only in their size fields. They in
turn can share a common meta meta type descriptor, whose tag points back
to itself, i.e. it is its own type descriptor. It may be more practical
though to mark type descriptors in some way and to treat them as special
cases.
Sweep Phase
In the sweep phase the heap is traversed sequentially, block by block.
To do that, the size of all blocks must be known. The sum of a block's
address and its size yields the address of the next block. Since there
is a type tag at the beginning of each allocated block, such a block's
size can be found by inspecting the type descriptor to which the tag
points. A free block can be treated the same way, with the difference
that it is its own "type descriptor". How this can be done is shown in
Figure 3.
Figure 3: Structure of a free block
At the beginning of the sweep phase, the free lists are all cleared. The
sweep constructs completely new free lists by treating consecutive
unmarked blocks as single large blocks and inserting them in the
appropriate free lists. All marks are cleared.
procedure Scan;
var p: address;
begin
A[1..N] := NIL;
q := HeapStart;
repeat
while (q # MemSize) & Marked(q) do
ResetMark(q); q := q + mem[mem[q]]
end;
if q # MemSize then
p := q;
repeat
q := q + mem[mem[q]]
until (q = MemSize) or Marked(q);
Insert(p, q - p)
end
until q = MemSize
end Scan;
The procedure Insert(p, s) inserts a free block at address p in the free
list for size s . The invariant over this "merge-sweep" is that all free
blocks that have been traversed already are of maximal size, i.e.
merged. Only the block visited most recently might have to be merged
with the next one. This invariant is a principal difference between the
merge-sweep deallocation and the deallocation of arbitrary blocks.
Acknowledgements
We would like to thank H. Mössenböck, N. Wirth, R. Griesemer and W. Weck.
References
1. Wirth N (1988) The Programming Language Oberon. Software-Practice and
Experience, 18 (7), 661-670
2. Wirth N, Gutknecht J (1989) The Oberon System. Software-Practice and
Experience, 19 (9), 857-893
3. Eberle H (1987) Development and Analysis of a Workstation Computer,
Ph. D. thesis no. 8431, ETH Zürich
4. Knuth D (1973) The Art of Computer Programming, Addison-Wesley
5. McCarthy J (1960) Recursive Functions of Symbolic Expressions and
Their Computation by Machine, I, Comm. ACM, 3, 184-195
6. Ungar D, Jackson F (1988), Tenuring Policies for Generation-Based
Storage Reclamation, OOPSLA `88 Proceedings, 107-118
7. Schorr H, and Waite W (1967), An efficient machine-independent
procedure for garbage collection in various list structures, Comm. ACM,
10 (8), 501-505 Appendix
The following listing shows an implementation of the mark phase for one
root pointer in MC68000 assembly language. A complete implementation
would additionally have to iterate over all global (and possibly all
local) pointer variables as roots.
Note that the mark bit in the type tag is set during the whole traversal
of a record, thus it can be statically compensated for by using an
offset of -1 when addressing relative to the type tag.
* 68000 mark phase for garbage collector
* A0: pointer to father
* A1: pointer to node
* A2: temporary, for pointer rotation
* A3: tag or pointer to current pointer offset
* pointer offsets are usually accessed via A3 with an offset of
* Offset(ptable) - 4 - 1.
* The pointer is incremented before it is accessed, thus the subtraction
* of PtrSize.
* The subtraction of 1 comes from the set mark bit (bit # 0).
* D0: offset
* D1: temporary
PtrSize EQU 4 ;size of pointers and offsets
Tag EQU -4 ;offset of type tag
TagL EQU Tag+3 ;low byte of type tag
Mark EQU 0 ;mark bit (in TagL)
PTab EQU 36 ;ptable offset
Offset EQU PTab-PtrSize-1
Start: MOVE.L A1,D1 ; NIL test
BEQ End ; NIL
BSET.B #Mark,TagL(A1) ; test and set mark bit
BNE End ; marked
MOVE.L #0,A0 ; father := NIL
MOVE.L Tag(A1),A3 ; load first tag
BRA Loop
Up: ADD.L D0,A3 ; adjust tag
MOVE.L A3,Tag(A1) ; save tag
MOVE.L A0,D1 ; NIL test
BEQ End ; father = NIL (sentinel)
MOVE.L Tag(A0),A3 ; load father.tag
MOVE.L Offset(A3),D0 ; load offset
MOVE.L (A0,D0),A2 ; rotate pointers, step 1
MOVE.L A1,(A0,D0) ; rotate pointers, step 2
MOVE.L A0,A1 ; rotate pointers, step 3
MOVE.L A2,A0 ; rotate pointers, step 4
Loop: ADDQ.L #PtrSize,A3 ; address of next offset
MOVE.L Offset(A3),D0 ; load next offset
BMI Up ; negative sentinel reached, i.e. end of
; list
MOVE.L (A1,D0),A2 ; load son (and rotate pointers, step 1)
MOVE.L A2,D1 ; NIL test
BEQ Loop ; NIL
BSET.B #Mark,TagL(A2) ; test and set mark bit
BNE Loop ; marked
Down: MOVE.L A3,Tag(A1) ; save tag
MOVE.L A0,(A1,D0) ; rotate pointers, step 2
MOVE.L A1,A0 ; rotate pointers, step 3
MOVE.L A2,A1 ; rotate pointers, step 4
MOVE.L Tag(A1),A3 ; load new tag
BRA Loop
End:
----------------------------------------------------------------------------
3. Type Guards and Type Tests
Cuno Pfister
In Oberon, indirectly referenced record variables (i.e. referenced by
pointer or passed as VAR parameter) may have a different type at
run-time than the one which is declared statically. An Oberon
implementation must guarantee that the actual type of a record is an
extension of the record's declared type. This is done with the aid of
type guards [1]: A type guard tests whether the type of an indirectly
referenced record variable is an extension of some statically declared
type. If not, the program is aborted. A type test is similar to a type
guard, but instead of aborting a program it returns the value FALSE,
otherwise TRUE. We present a scheme which allows a very efficient
implementation of type guards and type tests.
A record type is represented at run-time by a type descriptor (see
Figure). In this type descriptor there is a table of pointers (ttable),
at a fixed offset and with a fixed size. The pointer in entry 0 points
to the type descriptor of the original base type of the extension
hierarchy (level 0 type). Entry 1 points to the first extension (level 1
type), entry 2 to the extension of the first extension, and so on. If
the type descriptor denotes a level n type, the first n + 1 entries are
used. The entries for higher level types are set to NIL. The level 0
entry may even be omitted, since a type guard on a base type is never
executed at run-time. Nevertheless it is recommended to include it, for
an example of where it can be useful see technical note # 4.
Obviously the depth of the extension hierarchy is limited by such an
arrangement. We recommend a table size of about 8 entries. This is
thought to be large enough for all practical purposes.
ttable
type descriptor
record
p
tag
size
ptable
The generated code can be described as follows:
the type guard v(T) becomes
p := Tag(v);
IF p^.ttable[L] # Tadr THEN HALT(18) END
and the type test v IS T becomes
p := Tag(v);
RETURN p^.ttable[L] = Tadr
where L is the extension's level, and Tadr the address of T 's type
descriptor. T is a hidden global variable in the module which declares
type T .
Tag(v) is different for a record referenced via pointer and for a VAR
parameter. The former contains the tag in the record itself, while the
latter's tag is passed as an implicit parameter, together with the
address of the record. A guard applied to a NIL valued pointer aborts
the program.
The WITH statement produces the same code as a type guard, the
difference is only relevant for the compiler. A similar scheme has been
presented in [1].
Hidden type guards are generated for the assignment of one record to
another, indirectly referenced record variable. In this case the
implementation must enforce strict equality of the record types on both
sides of the assignment. This leads to a simpler type guard, namely:
p := Tag(v);
IF p # Tadr THEN HALT(19) END
References
1. Cohen N H (1989), Type-Extension Type Tests Can Be Performed in
Constant Time, IBM Research Report
2. Wirth N (1988), Type Extensions, ACM Trans. on Programming Languages
and Systems, Vol. 10, No. 2, 204-214
----------------------------------------------------------------------------
4. A Symmetric Solution to the Load/Store Problem
J. Templ 1.3.91
The problem of loading and storing polymorphic data structures from or
to files using load and store messages is usually considered to be
asymmetric because an object existing in memory can receive a store
message but an object existing on a file cannot receive a load message.
Nevertheless a symmetric solution for the problem is proposed. It is
argued that a symmetric solution is both, more beautiful and more
flexible. The key to a symmetric solution is in the separation of the
information associated with an object into a header and a contents part.
The header contains the type information and the contents part contains
the data associated with the object. As the type information is not
maintained by the object itself but by the class the object belongs to,
it is straight forward to use class methods (ordinary procedures) to
handle the type information and to use instance methods (type bound
procedures or message handlers) to handle the contents of an object. In
the proposed solution all classes handle the type information the same
way, therefore one can think of the type handling methods as meta class
methods. When dealing with extended types, the problem of loading and
storing inherited state (possibly invisible to the extended type)
arises. Using inherited load and store methods (super-calls) solves the
problem but there is a subtle point to observe. When the type
information is handled by the object itself instead of by the (meta)
class, each overriding method is forced to use super calls. Also super
calls must be done before any other data is stored onto the file. The
more flexible symmetric solution follows postulate 1:
"An object never stores its own type information as response to a store
message"
Loading an object o using a Rider R may be done by a procedure
ReadObj(R, o) that generates an object according to the header
information. Then the object's data can be loaded by sending a load
message, e.g. o.Load(R).
$ object = header contents.
Storing an object o using a Rider R may be done by a procedure
WriteObj(R, o). Storing the contents of the object may be done by
sending a store message o.Store(R).
Let T be a subtype of Object and T1 a subtype of T. Let Load and Store
be procedures bound to T and Load' and Store' procedures bound to T1
overriding and invoking the inherited procedures. Storing of an object v
with dynamic type T1 to a file is then done by the following steps:
1. WriteObj(R, v);
2. v.Store(R) invokes Store'
2.1. v.Store^(R) in Store' invokes Store
2.1.1. data associated with type T is stored
2.2. additional data associated with type T1 is stored
Loading of an object of type T1 from a file into a variable v is done in
symmetric steps:
1. ReadObj(R, v);
2. v.Load(R) invokes Load'
2.1. v.Load^(R) in Load' invokes Load
2.1.1. data associated with type T is loaded
2.2. additional data associated with type T1 is loaded
ReadObj and WriteObj are responsible for internalizing and externalizing
the object's type information. For internalized objects this information
consists of a type tag, i.e. a pointer to a type descriptor node unique
for a type. For externalized objects the type tags are mapped to
reference numbers that refer to the type of the object. The first
occurence of a type reference is followed by the externalized type
descriptor consisting of the name of the module defining the type and
the type's name.
$ header = ref [module type].
$ ref = integer.
$ module = char {char} 0X.
$ type = char {char} 0X.
For easy maintenance of the reference numbers of types, a ref field in
the type descriptor is assumed. To avoid resetting this field before
each "store session", a global virtual clock (store counter) is
introduced and instead of the reference number in the type descriptors a
time stamp is actually used. This time stamp contains the clock value of
the type's last externalization. The reference number written to the
file is the difference between the time stamp and the start time
(clock0) of the stores which is defined by calling a Reset procedure.
For "load sessions" a type table and a type counter has to be maintained
which are also initialized by the same Reset procedure. For more details
see the prototype implementation below. The use of the Reset procedure
is restricted by postulate 2:
"The initialization of the generic load/store mechanism must be symmetric"
More accuratly, each Reset call preceding a store sequence must
correspond to a Reset call preceding a load sequence and vice versa.
Normally, the resets are done on the level of user activated commands.
A module Files1 is assumed to support persistent data portable across
different Oberon implementations. Files1 contains procedures for loading
and storing basic types (integers, reals, ...) in a portable way and it
contains procedures to load and store the empty object, i.e. it handles
the dynamic type information associated with an object derived from
Files1.Object.
Properties of the proposed solution:
- no install mechanism required
- efficient externalization and internalization
- no additional storage in internalized objects
- compact external representation
- easy to implement
- low space overhead in type descriptors (time stamp plus type name)
- flexible in the use of super-calls
- pure Oberon (language)
A prototype of module Files1 has been implemented under SPARC-Oberon:
MODULE Files1; (* J.Templ, 24.2.91 *)
IMPORT SYSTEM, Files, Kernel, Modules;
TYPE
Object* = POINTER TO ObjectDesc;
ObjectDesc* = RECORD END ;
TDesc = POINTER TO RECORD
m: Kernel.Module;
name: ARRAY 24 OF CHAR;
time: LONGINT (* < clock *)
END ;
VAR
module*, type*: ARRAY 24 OF CHAR;
(* most recent internalized type *)
clock, noftypes: LONGINT; (* clock0 = clock - noftypes *)
typTab: ARRAY 256 OF LONGINT;
...
PROCEDURE New(typetag: LONGINT): Object;
...
END New;
PROCEDURE ThisType(m: Kernel.Module; VAR type: ARRAY OF CHAR): LONGINT;
...
END ThisType;
PROCEDURE Reset*;
BEGIN noftypes := 0
END Reset;
PROCEDURE ReadObj* (VAR R: Files.Rider; VAR o: Object);
VAR ref, tag: LONGINT; m: Kernel.Module;
BEGIN
Read(R, ref);
IF ref = noftypes THEN
ReadString(R, module);
ReadString(R, type);
m := Modules.ThisMod(module);
IF m # NIL THEN tag := ThisType(m, type);
IF tag # 0 THEN typTab[ref] := tag; INC(noftypes);
o := New(typTab[ref])
ELSE R.res := 1
END
ELSE R.res := 2
END
ELSIF ref # -1 THEN o := New(typTab[ref])
ELSE o := NIL
END
END ReadObj;
PROCEDURE WriteObj* (VAR R: Files.Rider; o: Object);
VAR tag: TDesc; t: LONGINT;
BEGIN
IF o # NIL THEN
SYSTEM.GET(SYSTEM.VAL(LONGINT, o)-4, t);
tag := SYSTEM.VAL(TDesc, t - 36);
IF tag.time < clock - noftypes THEN
Write(R, noftypes);
Files1.Write(R, noftypes);
tag.time := clock;
INC(noftypes); INC(clock);
Files1.WriteString(R, tag.m.name);
Files1.WriteString(R, tag.name)
ELSE Write(R, tag.ref)
ELSE Files1.Write(R, tag.time - (clock - noftypes))
END
ELSE Write(R, -1)
END
END WriteObj;
BEGIN clock := 1; noftypes := 0
END Files1.
Example: loading and storing a binary tree using type bound procedures.
TYPE
Tree = POINTER TO TreeDesc;
TreeDesc = RECORD
(Files1.ObjectDesc)
left, right: Tree
END
PROCEDURE (t: Tree) Load (VAR R: Files.Rider);
BEGIN
Files1.ReadObj(R, t.left);
IF t.left # NIL THEN t.left.Load(R) END ;
Files1.ReadObj(R, t.right);
IF t.right # NIL THEN t.right.Load(R) END ;
END Load;
PROCEDURE (t: Tree) Store (VAR R: Files.Rider);
BEGIN
Files1.WriteObj(R, t.left);
IF t.left # NIL THEN t.left.Store(R) END ;
Files1.WriteObj(R, t.right);
IF t.right # NIL THEN t.right.Store(R) END ;
END Store;
PROCEDURE StoreCmd*;
...
Files1.Reset;
Files1.WriteObj(R, t);
IF t # NIL THEN t.Store(R) END
END StoreCmd;
PROCEDURE LoadCmd*;
...
Files1.Reset;
Files1.ReadObj(R, t);
IF t # NIL THEN t.Load(R) END
END LoadCmd;
Note that an asymmetric solution which stores the type information of an
object as response to the store message is not significantly shorter
because NIL pointers have to be handled explicitly. It also has the
disadvantage that the external representation of NIL has to be known
(unless a special procedure that stores the value NIL has been
introduced).
PROCEDURE StoreCmd*;
(* the asymetric solution *)
...
Files1.Reset;
IF t # NIL THEN t.Store(R)
ELSE Files1.Write(R, 0)
END
END StoreCmd;
The following presents the complete interface of module Files1 together
with a short description of the external data representation.
DEFINITION Files1;
(*J.Templ 31.1.91*)
(* module to support portable persistent data.
ReadInt, WriteInt: 2 Byte integers, little endian byte ordering
ReadLInt, WriteLInt: 4 Byte integers, little endian byte ordering
ReadSet, WriteSet: 4 byte sets, little endian byte ordering, ORD({0}) = 1
ReadReal, WriteReal: 4 byte IEEE reals, little endian byte ordering
ReadLReal, WriteLReal: 8 byte IEEE reals, little endian byte ordering
ReadString, WriteString: arbitrary length, null terminated
Read, Write: compact integers, 1 to 5 byte, cf. ETH Report 133, 1990 *)
IMPORT Files;
TYPE
Object = POINTER TO ObjectDesc;
ObjectDesc = RECORD END ;
VAR module, type: ARRAY 24 OF CHAR;
PROCEDURE Read (VAR R: Files.Rider; VAR i: LONGINT);
PROCEDURE ReadInt (VAR R: Files.Rider; VAR i: INTEGER);
PROCEDURE ReadLInt (VAR R: Files.Rider; VAR i: LONGINT);
PROCEDURE ReadLReal (VAR R: Files.Rider; VAR r: LONGREAL);
PROCEDURE ReadReal (VAR R: Files.Rider; VAR r: REAL);
PROCEDURE ReadSet (VAR R: Files.Rider; VAR s: SET);
PROCEDURE ReadString (VAR R: Files.Rider; VAR s: ARRAY OF CHAR);
PROCEDURE ReadObj (VAR R: Files.Rider; VAR o: Object);
PROCEDURE Write (VAR R: Files.Rider; i: LONGINT);
PROCEDURE WriteInt (VAR R: Files.Rider; i: INTEGER);
PROCEDURE WriteLInt (VAR R: Files.Rider; i: LONGINT);
PROCEDURE WriteLReal (VAR R: Files.Rider; r: LONGREAL);
PROCEDURE WriteReal (VAR R: Files.Rider; r: REAL);
PROCEDURE WriteSet (VAR R: Files.Rider; s: SET);
PROCEDURE WriteString (VAR R: Files.Rider; VAR s: ARRAY OF CHAR);
PROCEDURE WriteObj (VAR R: Files.Rider; o: Object);
PROCEDURE Reset;
END Files1.
----------------------------------------------------------------------------
5. Garbage Collection on Open Arrays
J. Templ 3.3.91
Traditional garbage collectors for Oberon ignore the problem of
traversing array structures on the heap by assuming that the compiler
forbids such constructs (implementation restriction). However, there are
good reasons for supporting pointers to arrays (fixed size and open
arrays) with arbitrary element types and at the same time eliminating
the implementation restriction. This paper proposes a solution for
traversing array data structures that affects the inner loop of the
garbage collector's mark phase in the common case of traditional record
nodes only by two simple assignments when following a pointer (two
register moves on most processors). The outer loop needs two additional
bit tests (within registers). Although the mark phase can be formulated
without nested loops, one can think of the highly time critical
operations performed on each pointer of a node (skipping nil pointers,
skipping pointers that point to marked blocks, and following a pointer)
as the "inner loop". The "outer loop" contains all operations performed
on each block, i.e. the inner loop and some operations when leaving the
node. In other words, the inner loop is executed O(N) times, where N is
the number of reachable pointers, and the outer loop is executed O(M)
times, where M is the number of reachable blocks. It is obvious that N
>= M and in some cases N >> M. Let T, Elem and P be types as defined
below and v be a variable of type P.
TYPE
T = ARRAY n0, n1 .. ni-1 OF Elem;
Elem = RECORD ... END ;
P = POINTER TO T;
The proposed algorithm assumes a storage block pointed to by v to look
like this:
v-4 tag points to Elem type descriptor
v --> data points to the first array element
size n0 * n1 * .. * ni-1 * SIZE(Elem)
arrpos reserved for the garbage collector
The type descriptor of an array block is the type descriptor of the
element type which must be a record. Multi-dimensional arrays are
"flattened", i.e. they are treated like big one-dimensional arrays.
Arrays within records are not affected, i.e. they are still expanded in
the type descriptor of the record. In addition to the existing block
kinds SysBlk (a block allocated with SYSTEM.NEW) and RecordBlk (a block
allocated with NEW), a new kind ArrayBlk is defined, i.e. now there are
three different kinds of heap blocks. The mark phase of a garbage
collector supporting only SysBlk and RecordBlk looks like the following
procedure Mark that traverses all nodes reachable from the node pointed
to by q. Mark expects the parameter q to point to a marked record block.
PROCEDURE Mark(q: Pointer);
VAR n: Pointer;
BEGIN
q.cnt := 0;
LOOP
IF Traversed(q) THEN Reset(q);
IF StackEmpty THEN EXIT END ;
Pop(q)
ELSE
Pointer(q, n);
IF (n # NIL) & Unmarked(n) THEN SetMark(n);
IF RecordBlk(n) THEN Push(q); q := n; q.cnt := -1 END
END
END ;
INC(q.cnt)
END
END Mark;
The meaning of the macros is as follows (some of them will be used later):
RecordBlk(q), ArrayBlk(q), SysBlk(q), Unmarked(q)
check a particular bit combination usually encoded in the type tag of q
SetMark(q)
set a bit combination in the type tag of q to signal that q is reachable
Traversed(q)
true iff all nodes reachable from q are marked.
offset := Offset(q, q.cnt);
RETURN offset < 0
Reset(q)
resets some information temporarily encoded in the type tag of q
StackEmpty
true if stack of partially traversed nodes is empty
Pointer(q, n)
set n to the next son of q to be traversed.
offset := q.tag.PtrTab[q.cnt];
n := mem[q+offset]
Push(q)
Push q onto the stack of partially traversed nodes.
offset := Offset(q, q.cnt);
mem[q+offset] := tos; tos := q
Pop(q)
Pop q from the stack of partially traversed nodes.
offset := Offset(tos, tos.cnt);
n := mem[tos+offset]; mem[tos+offset] := q; q := tos; tos := n
Offset(q, n)
the offset of the n-th pointer in block q
Elemsize(q)
the size of one array element.
Elemsize(q) should be expandable to q.tag.size, i.e. the size of the
element type should be available in the type descriptor. Most Oberon
implementations currently round the record size available in the type
descriptor to the next power of two or to the next number divisible by
16 or the like. In this case, the element size must be included in the
array block needing some additional space. q.cnt the number of sons of q
that are already traversed
To include array blocks, we apply the mark algorithm iteratively to all
array elements in the same way as it is done for records. For array
blocks q.arrpos is used to hold the offset of the current array element,
i.e. q.arrpos DIV Elemsize(q) holds the number of fully traversed array
elements. q.cnt gives the number of sons of the element pointed to by
q.arrpos that are already traversed. Mark now expects the parameter q to
point to a marked record or array block.
PROCEDURE Mark(q: Pointer);
VAR n: Pointer;
BEGIN
IF ArrayBlk(q) THEN q.arrpos := 0 END;
q.cnt := 0;
LOOP
IF Traversed(q) THEN
Reset(q);
IF ArrayBlk(q) & (q.arrpos + Elemsize(q) # q.size) THEN
INC(q.arrpos, Elemsize(q)); q.cnt := -1
ELSIF StackEmpty THEN EXIT
ELSE Pop(q)
END
ELSE
Pointer(q, n);
IF (n# NIL) & Unmarked(n) THEN
SetMark(n);
IF RecordBlk(n) OR ~SysBlk(n) THEN Push(q); q := n; q.cnt := -1 END
END
END ;
INC(q.cnt)
END
END Mark;
Unfortunately every macro that accesses a pointer in q needs to
distinguish between Record and Array blocks now. For the Pointer macro
the situation is like this:
Pointer(q, n) =
offset := Offset(q, q.cnt);
IF ArrayBlk(q) THEN n := mem[q.data + q.arrpos + offset]
ELSE n := mem[q + offset]
END
To avoid this distinction, we introduce an auxiliary variable t and an
auxiliary invariant:
H(t, q): <=> (RecordBlk(q) & t = q) OR (ArrayBlk(q) & t = q.data + q.arrpos)
Pointer(q, t, n) =
offset := Offset(q, q.cnt);
n := mem[t + offset]
Push(q, t) =
offset := Offset(q, q.cnt);
mem[t + offset] := tos; tos := q
Pop(q, t) =
offset := Offset(tos, tos.cnt);
IF ArrayBlk(tos) THEN t := tos.data + tos.arrpos
ELSE t := tos
END;
n := mem[t + offset]; mem[t + offset] := q; q := tos; tos := n
t must be set appropriately when entering a node, but it remains
unchanged while looping over nil pointers, marked pointers, or sysblks
within a node. The overhead when pushing a record node is only a single
assignment (shown in italics), the overhead for pushing an array node is
one additional test and two assignments. The overhead for returning from
a record node is two bit-tests and one assignment (shown in italics),
for finishing array elements, another test for detecting the end of the
array is needed.
PROCEDURE Mark(q: Pointer);
VAR n, t: Pointer;
BEGIN
IF ArrayBlk(q) THEN q.arrpos := 0; t := q.data ELSE t := q END ;
q.cnt := 0;
LOOP {H}
IF Traversed(q) THEN Reset(q);
IF ArrayBlk(q) & (q.arrpos + Elemsize(q) # q.size) THEN
INC(q.arrpos, Elemsize(q)); q.cnt := -1; INC(t, Elemsize(q))
ELSIF StackEmpty THEN
EXIT
ELSE
Pop(q, t)
END
ELSE
Pointer(q, t, n);
IF (n # NIL) & Unmarked(n) THEN
SetMark(n);
IF RecordBlk(n) THEN
Push(q, t); q := n; t := q; q.cnt := -1
ELSIF ~SysBlk(n) THEN
Push(q, t); q := n; q.arrpos := 0; t := q.data; q.cnt := -1
END
END
END ;
INC(q.cnt)
END
END Mark;
The modest additional complexity of the solution and the small runtime
overhead for record blocks (shown in italics) seem to be justified by
the additional functionality.
In practice it turns out that a sophisticated encoding of the predicates
ArrayBlk(q), SysBlk(q), RecordBlk(q), and Unmarked(q) has to be used to
get a fast collector. The encoding used in a prototype implementation in
SPARC-Oberon is explained below using an Oberon-like notation with
relaxed typing rules, e.g. q.tag is sometimes used as a set, sometimes
as a pointer, and sometimes as an integer. To avoid confusion, set
operators are indexed with s. It is assumed that sets, pointers and
integers have 4 bytes, and that bit i corresponds to value 2^i.
Predicate Encoding Invariant
Unmarked(q) ~(0 IN q.tag)
RecordBlk(q) ~(1 IN q.tag) RecordBlk(q) => ~SysBlk(q)
ArrayBlk(q) 1 IN q.tag
SysBlk(q) 2 IN q.tag SysBlk(q) => ArrayBlk(q)
free(q) 3 IN q.tag
These encodings follow the rule that an allocated unmarked record block
should not have any auxiliary bits set in the type tag, i.e. it should
have a valid type tag without masking some bits first. This is important
for fast type guards and type tests during program execution. For
counting the number of sons already visited, the technique proposed by
B. Heeb is used. Therefore the type tag is used as a pointer into the
offset table, too. The pointer offsets are 4 byte integers, i.e. only
the low order two bits of the tag can be used for Unmarked and
RecordBlk. Fortunately, SysBlk is never used when traversing a node, as
system blocks are supposed to have no pointers at all. free(q) is used
by the scan phase of the collector. Using 4 bits in the type tag forces
type descriptors to a 16 byte alignment. Note that the type tag can only
be used as a pointer after masking the low order bits. In the following
mem[p] means the 4 byte word at memory location p.
PROCEDURE Mark(q: Pointer);
VAR n, t, tos: Pointer; offset: LONGINT;
BEGIN
IF ArrayBlk(q) THEN q.arrpos := 0; t := q.data ELSE t := q END ;
INC(q.tag, PtrTabOffset); tos := NIL;
LOOP {H}
offset := mem[q.tag - {0, 1}];
IF offset < 0 THEN INC(q.tag, offset);
IF ArrayBlk(q) & (q.arrpos + Elemsize(q) # q.size) THEN
INC(q.arrpos, Elemsize(q)); INC(q.tag, PtrTabOffset - 4);
INC(t, Elemsize(q))
ELSIF tos = NIL THEN EXIT
ELSE Pop(q, t)
END
ELSE
n := mem[t + offset];
IF (n # NIL) & Unmarked(n) THEN
INCL(n.tag, 0);
IF RecordBlk(n) THEN
Push(q, t); q := n; t := q; INC(q.tag, PtrTabOffset - 4)
ELSIF ~SysBlk(n) THEN
Push(q, t); q := n; q.arrpos := 0;
t := q.data; INC(q.tag, PtrTabOffset - 4)
END
END
END ;
INC(q.tag, 4)
END
END Mark;
The macros Push and Pop are now defined as:
Push(q, t) =
mem[t + offset] := tos; tos := q
Pop(q, t) =
offset := mem[tos.tag - {0, 1}];
IF ArrayBlk(tos) THEN t := tos.data + tos.arrpos ELSE t := tos END;
r := mem[t + offset]; mem[t + offset] := q; q := tos; tos := r
This procedure may be implemented using assembly language or "pseudo
Oberon" (Oberon + module SYSTEM). However, there are still some
improvements possible. The repeated access to mem[q.tag - {0, 1}] in the
inner loop can be accelerated by introducing an auxiliary variable tag
initialized with q.tag - {0, 1}. In this case, q.tag has to be updated
whenever a node is left. This update operation and the bit tests before
Pop may be optimized by introducing another variable qtag with qtag =
q.tag * {0,1}.
There are also some common subexpressions that could be eliminated (by a
compiler).
PROCEDURE Mark(q: Pointer);
VAR n, t, tos: Pointer; offset, tag: LONGINT; qmask, ntag: SET;
BEGIN
IF 1 IN q.tag THEN
q.arrpos := 0; t := q.data; qmask := {0, 1}
ELSE t := q; qmask := {0}
END;
tag := q.tag - {0, 1} + PtrTabOffset; tos := NIL;
LOOP {H}
offset := mem[tag];
IF offset < 0 THEN
q.tag := tag + offset + qmask;
IF 1 IN qmask & (q.arrpos + Elemsize(q) # q.size) THEN
INC(q.arrpos, Elemsize(q)); INC(tag, offset + PtrTabOffset - 4);
INC(t, Elemsize(q))
ELSIF tos = NIL THEN EXIT
ELSE
qmask := tos.tag; tag := qmask - {0, 1}; qmask := qmask * {0, 1};
IF 1 IN qmask THEN t := tos.data + tos.arrpos
ELSE t := tos
END;
offset := mem[tag]; n := mem[t + offset]; mem[t + offset] := q;
q := tos; tos := n
END
ELSE
n := mem[t + offset];
IF (n # NIL) THEN
ntag := n.tag;
IF ~(0 IN ntag) THEN
q.tag := tag + qmask; n.tag := ntag + {0};
IF ~(1 IN ntag) THEN
mem[t + offset] := tos; tos := q; q := n; t := q;
tag := ntag + PtrTabOffset - 4; qmask := {0}
ELSIF ~(2 IN ntag) THEN
mem[t + offset] := tos; tos := q; q := n; q.arrpos := 0;
t := q.data; tag := ntag - {1} + PtrTabOffset - 4;
qmask := {0, 1}
END
END
END
END;
INC(tag, 4)
END
END Mark;
The overhead for record nodes is shown in italics. On modern processors
it consists of about two machine cycles when following a pointer to a
record node and six cycles when leaving a record node. Switching from
one array element to the next needs one additional compare, two storage
reads, one storage write, and three additions (15 - 20 cycles).